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Abstract 

This paper treats the problem of detecting periodicity in a sequence of photon arrival times, 
which occurs, for example, in attempting to detect gamma-ray pulsars. A particular focus is on 
how auxiliary information, typically source intensity, background intensity, and incidence angles 
and energies associated with each photon arrival should be used to maximize the detection 
power. We construct a class of likelihood-based tests, score tests, which give rise to event 
weighting in a principled and natural way, and derive expressions quantifying the power of the 
tests. These results can be used to compare the efficacies of different weight functions, including 
cuts in energy and incidence angle. The test is targeted toward a template for the periodic 
lightcurve, and we quantify how deviation from that template affects the power of detection. 

1 Introduction 

From a sequence of photon arrival times < t\ < t2 < • • • ijv < T, we wish to test the hypothesis 
that some of the photons come from a periodic source (for example, a gamma-ray pulsar) versus the 
null hypothesis that they come from a background plus a source that does not vary in time. The 
background emission rate is assumed to be constant in time. Associated with each event is auxiliary 
information, such as the incidence angle and the measured energy; we denote these variables by 
z. Ignoring this information is clearly wasteful, and in fact it would typically be used, at least 
in the form of cuts in energy and incidence angle. The value of z associated with an event (an 
arrival) provides information about the relative likelihood that photon was from the source or the 
background, and it seems intuitively that the event should be correspondingly weighted in some 
manner. (Note that cuts corresponds to weights that are zero or one.) A main thrust of this paper 
is to derive in a principled way how this information can best be used to enhance detection power. 
We derive expressions which quantify the efficiency of any weighting function and the form of the 
optimal function. 

Unless the periodic light curve is known, there is no universally optimal test, since a test that 
is most powerful against one light curve will not be most powerful against another. This statement 
also applies to tests that attempt to adapt to the form of the lightcurve. Any test implicitly or 
explicitly commits to a finite dimensional class of targets. Generally, the light curve of the source 
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is unknown, so we consider testing against a template, a probability density function vo(t) on [0, 1], 
extended periodically, with Fourier series 

^(i) = l + r?J> n e™ (1) 

for T] > 0. If i] = 0, the source intensity is constant in time. Defining v T (t) = i>o(t + r) 

u T (t) = 1 + r, ^ a n e 2mnt+2ninr , (2) 

n^O 

We model the arrival times as the superposition of independent background and source processes, 
a Poisson process with rate function 

\(t\9,T,n,f) = nc(t)[(l-9) + eu T (<t>(t))], 0<9<1 (3) 

where c{t) denotes the sensitivity of the instrument at time t. Here 9 is the proportion of flux from 
source; the phase function is 4>{t) = ft, or if drift is taken into consideration, 4>{t) = ft + ft 2 /2. 
Within this framework, different hypotheses can be tested. We focus on testing the null hypothesis 
H : T) = versus the alternative K : r] > 0. That is, we are concerned with a situation in which 
the presence of a source is not in doubt, but its periodicity is in question. Testing whether there is 
any source at all corresponds to testing H2 : 9 = against K2 : 9 > 0. 

This paper extends some results of Bickel et al. (2007), with more extensive considerations of 
event weighting. In the next section we derive a test which makes use of the information contained 
in both the arrival times, tj, and the associated variables, Zj, in a principled way, by appropriately 
weighting the arrival times. In Section 3, we show how the detection power of the test depends 
on the weights. Expressions derived there allow comparison of power when ideal weights are used 
and using approximate weights, such as simple cuts. We will also see the price paid for mismatch 
between the template and the actual light curve and for mismatch of the specified frequency and 
the actual frequency. Section 4 contains some illustrative examples. Some technical details are 
deferred to an Appendix. 



2 Score test 

Let /b(z) denote the probability density function of z for a background event and fs(z) the density 
function for a source event. We base a test on the likelihood function, assuming that the Zj are 
independent of the arrival times: 

N T 

L = (i N H c(^)[(l - 0)f B (zj) + 9f s {z j )v T m j ))\ exp ( - W c(t)[(l - 9) + 9v T (4>{t))]dt) (4) 
j=l J o 

A score test (Lehman and Romano, 2006) of H versus K is formed by evaluating the derivative of 
the log likelihood at 77 = 0: 

- 1 ( (1 _ w'ffij efs(zj) - ')) - * jf *>M«'» - H* (5) 
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If 4>(T) S> 1 and c{t) varies slowly and is nonzero over a substantial fraction of [0,T], the second 
term is neglible. We will make this assumption throughout. 

Let 

w . = (6) 

3 (i-e)f B ( Zj ) + ef s (z 3 ) {> 

This is the probability that photon j is from the source, given Zj. For a very weak source (small 
9), an approximation to ^ gives wj oc fs( z j)/fB(zj)- H z = (E,ip), energy and incidence angle, 
we can write 

f B (z) = f B (E)f B (<p\E) (7) 

f s (z) = fs(E)f s (<p\E) (8) 

w(z) = w(E)w(ip\E) (9) 

The optimal weight function is then 

(„ ef s (E)f s (<p\E) 

° pU ' V) ~ 6fs(E)f s (v\E) + (1 - 9)f B (E)f B &\E) [W) 

For a weak source (small 9), we have the approximation 

fs(E)fsME) 

ME) M <P\E) <U) 

The function fs(<p\E) is the point spread function of incidence angle at energy E. The background 
would normally be assumed to be spatially uniform, from which f B (ip\E) would follow. The optimal 
weight function also depends on the ratio of the energy spectra of source to background, which 
potentially provides valuable information, but might be unknown in practice. In the latter case 
one could use a weight function, 

^-iMmWw) (12) 

or for small 9 

fs(<p\E) . . 

W ^ = 1ME) (13) 

The test statistic ^ depends on the data through 

N N 

j=l j=l n^O 

= ^2 a n A ne 2mnT (15) 

where A n = ^ ■ Wj exp(2mn^)(tj)). To eliminate the dependence of the test statistic on the phase, 
r, we use Jq \S(t)\ 2 cIt. By Parseval's theorem 

j | J2»nA n e^ inT \ 2 dT = J2 Wn\ 2 \An\ 2 (16) 
, ' n^O n^O 
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Unless the source is weak, the statistic depends upon 0, which may be approximately known from 
other analyses, or may be estimated by maximum likelihood under the null hypothesis. In the 
latter case, the log likelihood is 

N 

1(0) = Nlog f i + Y J logc(t J ) + 




The log likelihood depends on only through the third term, which can be easily maximized 
numerically, if /b(^) and fs(z) are known. The final test statistic either uses the value of known 
a priori or the maximum likelihood estimate: 

QT=^K| 2 |An| 2 (18) 

The score test is an attractive alternative to a generalized likelihood ratio test. To compute the 
likelihood ratio test, the likelihood Q would have to be maximized both under H and K, and the 
latter would entail estimating the parameters 0, ij and r. 

Beran (1969) showed that this test, in an unweighted form, was locally most powerful invariant 
for testing uniformity of a distribution on the circle. In the case \a n \ = 0, n > 1 and Wj = 1, this 
is Rayleigh's test (Rayleigh, 1919). If \a n \ = 1, n < m and \a n \ =0, n > m and Wj = 1, this is 
the test of Buccheri et al. (1983). De Jager et al. (1989) proposed the H-test, which chooses 
m adaptively. 

We now consider the distribution of Qt when there is no periodicity (77 = 0). Let f3\ = 
J w(z)fB(z)dz and Ci = / w(z)fs(z)dz be the expected values of the weight of background and 
source events and let fi% = J w 2 (z)fB(z)dz and C2 = f w 2 (z) fs(z)dz. The average value of a weight 
is E(W) = (l-0)/?i + 0Ci and E(W 2 ) = (l-0)f3 2 + 9( 2 . Let /i = fiT' 1 c(t)dt. In the Appendix 
we argue that 

e h Qt *l [{i-e)(3 2 + ec 2 ]^Y,\ a n\ 2 (is) 

Var H (Q T ) ~ [(l-9)p 2 + e( 2 } 2 f i 2 Y,\ a ^ ( 20 ) 

Also 2|^4„| 2 /(/io7 1 [(l — 9)f3 2 + 9(^2]) has approximately a chi-square distribution with two degrees of 
freedom. The A n are approximately independent so that Qt approximately has the distribution of 
a weighted sum of independent chi-square random variables. The scaling of the chi-square random 
variables can be estimated as follows: observe that since ^jlqT is the expected number of events in 
[0,T], Y,w 2 ~ fi TE{W 2 ). Thus 

0oT[(l - 0)/& + 6K2] ( 21 ) 

j 
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3 Power 



We next consider properties of the test statistic Qt when there is a periodic source, i.e. r\ > 0. Let 
the pulse shape of the source be 

l(t) = J2^ int (22) 

As an indication of the detection power of the test, we can use the signal to noise ratio. Let 
Eh(Qt) and Ek(Qt) respectively denote the expected values of the test statistic Qt when there 
is and is not a periodic component, and let an denote the standard deviation of Qt under the null 
hypothesis of no periodic component. If the phase function cp(t) is correctly identified (e.g. if / 
and / are correctly specified) we show in the Appendix that 



e k (Qt) - E H (Q T ) a2rr ef .£ 
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-gVM 7 4112 (23) 

Here the efficiency of the weighting function enters as 

Ft \ Ci 2 [E(W\ source)] 2 

£{w) = (i-e)p 2 + et 2 = eW) (24) 

This expression holds for any weight function. Since a weight function need only be defined up 
to a constant of proportionality, the denominator provides a normalization. The optimal weight 
function is that given by the score test, ([6]), in which case it follows from a short calculation that 

g (^-j/ (i-^wl W ,) ^ < 25 > 

which is the ratio of the average probability of a source event given z to the marginal probability of 
a source event. The efficacy of weighting depends in this way on the degree to which z discriminates 
between background and source, or on how correlated it is with the optimal weight function, since 
after some algebra, 

p . , [£(^opt)] 2 

where the expectations are taken with respect to the marginal density of Z, (1 — 0)/b{z) + Ofs(z). 
In the case of no weighting, w(z) = 1, £ = 1. 



From (23), the detection threshold for a weak signal is 9 of the order T -1 / 2 . The expression 
also quantifies how the power depends upon the match of the template {|a n | 2 } to the source profile 
{ 1 7n | 2 } • Maximal power is achieved when \a n \ 2 cx |7 n | 2 . So for detection of periodicity of a given 
source, the best detection-statistic has the same spectrum as the source in question. Because the 
latter is unknown, a template could be based on known sources (see Section 4 for an example). 

This result assumes that <fr(t) is very accurately specified. In the case 4>(t) = fat + /ot 2 /2, and 
approximate values are used, / = /o + A/T and / = /o + A/T 2 , A < 1, the sum in the numerator of 



(23) becomes £ n ^ l7n| 2 |«n| 2 (l — 0((nA) 2 )). Thus, accurate specification is especially important 
for higher harmonics to contribute to the power. This depends on the rate of decay of j n and on 
that of a n , which for practical reasons would be zero for sufficiently large n. 
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Light Curves 



Normalized Fourier Coefficients 




Figure 1: Left panel: smoothed light curves, computed from single EGRET viewing periods, for 
the Crab, Geminga, and Vela pulsars. Right panel: normalized coefficients \A n \ 2 for each pulsar. 



4 Examples 
4.1 Template 

To illustrate the effect of the template choice, {|a n | 2 }, we phased photon arrival times from Crab, 
Geminga, and Vela for single EGRET viewing periods. We calculated the corresponding coefficients, 
\A n \ 2 (with no weighting). For pedagogical illustration, we normalized them and regard them as 



the coefficients \j n \ (22) of the sources. These coefficients are plotted in Figure 1. It is interesting 



that in all cases the coefficient I72I 2 is largest. 

The template {|a n | 2 } will be most powerful for a particular source if \a n \ 2 oc |7 n | 2 , which is 
of course not possible in practice. To illustrate the effects of suboptimal templates, we evaluated 
percent efficiency for sequences \a n \ = 1, n < m and \a n \ = 0, n > m, for m = 1, 2, . . . , 10 (the 



test). (By "efficiency" we mean the percentage of the signal to noise ratio (23) that is attained 
relative to that attained by the optimal template, \a n \ 2 oc |7n| 2 -) The results are displayed in 
Table 1. As would be expected from Figure 1, the efficiency increases initially with m, and then 
decreases. Considerable gains in power would result from using two to five harmonics, since the 
signal to noise ratios increase by factors of two to three. For example, one would expect that a 
8.9<r result using the first three coefficients for Crab would only be a 2.8a result using the Rayleigh 
test. We also experimented with using the average of the three sources as a template, cutting off 
after five and ten terms. Those results are shown in Table 2. (The first five average coefficients 



6 



are 0.35, 0.77, 0.43, 0.17, 0.26.) Very little is gained in going from five to ten non-zero coefficients, 
and the computational savings would be substantial, since we would need |nA| < 1 for the highest 
harmonic. For example, if one were using ten harmonics the "natural" Fourier frequencies k/T 
would have to be oversampled by a factor of at least ten and all ten harmonics would have to be 
calculated. 



Table 1: Relative efficiencies for m = 1,2,. ..,10. 



number of coefficients 


1 


2 


3 


4 


5 


6 


7 


8 


9 


10 


Crab 


28 


65 


89 


83 


88 


84 


80 


78 


74 


70 


Geminga 


23 


82 


67 


71 


66 


63 


59 


57 


54 


52 


Vela 


42 


67 


86 


79 


87 


80 


84 


79 


79 


77 



Table 2: Relative efficiencies obtained from using the first five and first ten average coefficients as 
the template. 



number of terms 


5 


10 


Crab 


96 


97 


Geminga 


85 


85 


Vela 


89 


93 



4.2 Weight function 

Consider a source which emits photons at rate a and a background whose rate is p per unit area and 
suppose that photons are collected in a disc of radius R (rather than a spherical cap, for simplicity). 
Then 

H = TrR 2 p + a (27) 

= (28) 

irR 2 p + a v ' 

SbW) = % 0<<P<R (29) 

Let (3 = 2np/a, a measure of the strength of the background relative to the source. Then the 
denominator of (124]) is 



E ( W ^ = P2 " [ / ™ 2 ( E )fs(E) [ w 2 (v\E)fs(<p\E)d<pdE + 
TrR 2 p + alJ J 

13 j w 2 {E)f B {E) j ^w(v\E)d^dE (30) 



The factor a/(irR p + a) when combined with the factor 9 po in (23) is proportional to a. The 
optimal weight function is then 



fs(E)f s &\E) 
fs(E)f s ^\E) + (3<pf B (E) 
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which depends on the energy spectra through their ratio. 



If the psf is bivariate circular Gaussian with standard deviation a(E), then tp, the distance to 
the origin, has the probability density function 

ftM£) = ^exp(-^) (32) 

(This assumes that a{E) <C R, otherwise the density truncated at R has to be normalized to have 
unit area.) Then the optimal weight function is 

W ° pm *° = fs(E)+Pa(E)eM^/^(E))f B (E) (33) 

If photons are not differentially weighted according to the ratio of the energy spectra, one has the 
weight function 

W{E ' & = l + P*(E)eM^/2a*(E)) (34) 

The decay of this weight function depends on the parameter £ = /3a(E). If this parameter is 
very large (weak source/strong background/large a), w(tp) oc exp(— (ip 2 /2a 2 (E))). Numerical ex- 
ploration shows that there is little difference among the functions for £ > 1, but if £ = 0.1 and 
£ = 0.01, the weights decay substantially more slowly. For example, if £ = 1 a 2a incidence angle 
is given weight (relative to that of a photon that is directly on source) of about 0.1 and a 3a angle 
is given approximately zero weight. In comparison, for £ = 0.01 a 2a angle receives weight about 
0.9, a 3a angle receives weight about 0.5 and a 5a angle receives weight approximately 0. 



5 Conclusion and Discussion 

We have presented a class of tests that depend on two features: a template for the form of the 
periodic light curve and a function that differentially weights arrival times. We have suggested using 
a template constructed as the average of those of known sources, but one could choose the template 
adaptively, for example by considering the maximum of the test statistic over the light curves from 
the known sources. The power of such a test would be more difficult to analyze explicitly, as would 
be the power of the H-test. From general theory we know that any test will not be uniformly most 
powerful, but will perform better in certain "directions" than others. Janssen (2000) shows that 
in testing for uniformity any test can achieve high power for at most a finite dimensional family of 
alternatives. This can be seen quite clearly in expressions we have developed to quantify the power 



of the test (23). 



Ideally, the weight given to a photon arrival should be proportional to the probability that the 
photon came from the source, given its measured energy, incidence angle, and any other available 
information. The optimal weight function can only be approximated in practice. It depends on the 
ratio of the energy spectra of the source and background, which may not be accurately known for 
a faint source. It depends, through fs(<p\E), on the source location, which may also be subject to 



uncertainty. The efficiency of any weight function, w(z), has the conceptually simple form (24) 



The score test was derived under some assumptions that may not strictly hold in practice. We 
assume that the photon arrival process is Poisson, which does not take into account instrument 
dead time following the arrival of a photon. We also assume that the distribution of z does not 
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depend on the arrival time. This does not take into account possible dependence between energy 
and the phase of the source (see Fierro et al. 1998) Nonetheless, the form of the statistic Qt is 
such that it is sensitive to periodic sources, even when the assumptions upon which it was derived 
do not strictly hold. 

The score test was derived to discriminate between a periodic source and background which 
is not time varying. From the nature of the construction, it is clear that a similar test could be 
derived to take into account a background intensity which varies in time in a known way, perhaps 
for example a known nearby pulsar. 
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7 Appendix 

Here we sketch arguments supporting the assertions about the distribution of the test statis- 
tic Qt under the null and alternative hypotheses. We assume that 4>(T) 3> 1 and that c(t) 
varies slowly and is nonzero over a substantial fraction of [0,T]. In particular we assume that 

Jq exp(2irin(f)(t))c(t)dt is negligible compared to J Q T c(t)dt, which is true, for example, if c(t) is 
constant. 



7.1 Null distribution 

Let W(t) = J2jLi Wj5(t — tj). Under the null, all events are background and 

E-^A n = E^= [ e 2vin<t>{t) dW(t) (35) 

(l-*)0L + *Cl f T e 2«in<t>(t) Xm (36) 



IT jo 

= F e 2«in<Kt) cm (37) 

vT Jo 

~ (38) 

The approximation holds under the assumptions above about <j)(t) and c{t). Similarly, the real and 
imaginary parts of T~ 1 l 2 A n are approximately uncorrelated. To calculate .E^nl 2 we use 

E[dW(t)dW(s)} = \(t)[(l-9)(3 2 + 6( 2 }6(s-t)dsdt 

+\{s)X(t)[e 2 Cf + 29(1 - 0)Ci/?i + (1 - Of^dsdt (39) 
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Then 



E\A r . 



T 

o Jo 



?™ n ^e- 2 ™ n ^E[dW{s)dW{t)] 



T 



[(1 - 9)132 + 0(2] / X(t)dt + [0(i + (1 - 0)ft. 



t 2 

2irin(f>(t) ■ 



>X(t)dt 



(40) 
(41) 



The first term is dominant. The limiting chi-squared approximation follows from a central limit 
theorem argument about the distribution of the linear statistic T~ l / 2 A n . 



7.2 Power 



To evaluate EkQt we need to calculate E[dW (t)dW (s)}. First, for s = t, the event is either with 
probability 9 from source or with probability (1 — 9) from background. Thus 



E[dW(s)dW(t)} = nc(t)[9&l{<l){t)) + (1 - 9)(3 2 }dt, s 



(42) 



For s ^ t there are three possibilities: both events are from source, both are from background, or 
one is from source and one is from background. 



E[dW(t)dW(s)) 



v 2 c(s)c(t)[9 2 chm)h(Hs)) + (1 - of Pi + 

9(1 - 0)Ci/?i(7(«Ks)) + l(<f>(t))]dsdt, s + t 



(43) 



We initially assume that the phase 4>(t) is properly specified, i.e. that / and / are identified. 
-E|j4 n | 2 contains contributions of all the terms in (42) and (43). Some analysis shows that the 



leading order comes from the first term in (43), leading to 

2 



2q2 



e~ 2nin ^c(t)j(cb(t))dt 



f lkc(t)e~ 2 ™^e 2 ™ k ^dt 
k ^ 

/^ 2 |7n| 2 [ f <t)dt 



Thus, under the alternative 



77. m 2/)2a2[Jo c(t)dt] 2 ■s-^ 2 

EkQt ^ (J. 9 Ci rp 2^ ' 7n ' l an ' 



(44) 
(45) 

(46) 



The approximation for frequency misspecification, A / 0, follows from Taylor series expansions, 
noting that the first derivatives vanish at A = 0, since that point is a maximum. 
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